Multi-Criteria-Based Strategy to Stop Active Learning for Data Annotation
نویسندگان
چکیده
In this paper, we address the issue of deciding when to stop active learning for building a labeled training corpus. Firstly, this paper presents a new stopping criterion, classification-change, which considers the potential ability of each unlabeled example on changing decision boundaries. Secondly, a multi-criteriabased combination strategy is proposed to solve the problem of predefining an appropriate threshold for each confidence-based stopping criterion, such as max-confidence, min-error, and overalluncertainty. Finally, we examine the effectiveness of these stopping criteria on uncertainty sampling and heterogeneous uncertainty sampling for active learning. Experimental results show that these stopping criteria work well on evaluation data sets, and the combination strategies outperform individual criteria.
منابع مشابه
The Effects of Multimedia Annotations on Iranian EFL Learners’ L2 Vocabulary Learning
In our modern technological world, Computer-Assisted Language learning (CALL) is a new realm towards learning a language in general, and learning L2 vocabulary in particular. It is assumed that the use of multimedia annotations promotes language learners’ vocabulary acquisition. Therefore, this study set out to investigate the effects of different multimedia annotations (still picture annotatio...
متن کاملMulti-criteria-based Active Learning for Named Entity Recognition
In this thesis, we propose a multi-criteria-based active learning approach and effectively apply it to the task of named entity recognition. Active learning targets to minimize the human annotation efforts to learn a model with the same performance level as supervised learning by selecting the most useful examples for labeling. To maximize the contribution of the selected examples, we consider ...
متن کاملMulti-Criteria-based Active Learning for Named Entity Recognition
In this paper, we propose a multi-criteria based active learning approach and effectively apply it to named entity recognition. Active learning targets to minimize the human annotation efforts by selecting examples for labeling. To maximize the contribution of the selected examples, we consider the multiple criteria: informativeness, representativeness and diversity and propose measures to quan...
متن کاملUsing Variance as a Stopping Criterion for Active Learning of Frame Assignment
Active learning is a promising method to reduce human’s effort for data annotation in different NLP applications. Since it is an iterative task, it should be stopped at some point which is optimum or near-optimum. In this paper we propose a novel stopping criterion for active learning of frame assignment based on the variability of the classifier’s confidence score on the unlabeled data. The im...
متن کاملCombining Active Learning and Partial Annotation for Japanese Dependency Parsing
The machine learning-based approaches that dominate natural language processing research require massive amounts of labeled training data. Active learning has the potential to substantially reduce the human effort needed to prepare this data by allowing annotators to focus on only the most informative training examples. This paper shows how active learning can be used for domain adaptation of d...
متن کامل